@huan233usc commented Oct 27, 2025

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

This PR introduces architectural improvements to prepare for CCv2 (catalog-managed table) support. It makes no functional changes.

1. Introduce SnapshotManager Abstraction

  • Created SnapshotManager interface to encapsulate snapshot operations
  • Implemented PathBasedSnapshotManager (current path-based implementation)
  • Moved snapshot logic from StreamingHelper to PathBasedSnapshotManager
  • Added getTableChanges() method to encapsulate CommitRangeBuilder logic
  • Added getMetadata() method to avoid SnapshotImpl casting
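
To make the shape of the abstraction concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `SketchSnapshot`, `SketchSnapshotManager`, `SketchPathBasedSnapshotManager`, the `commit()` helper, and all method signatures are hypothetical stand-ins, and an in-memory map replaces the real Delta log. Only the idea is taken from the description above: callers talk to an interface, and the path-based implementation is the one place that knows about the table path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Entry point is declared first so the file also runs in single-file source mode.
class SnapshotManagerSketch {
  public static void main(String[] args) {
    SketchPathBasedSnapshotManager manager =
        new SketchPathBasedSnapshotManager("/tmp/hypothetical-table");
    manager.commit("id INT");
    manager.commit("id INT, name STRING");
    manager.commit("id INT, name STRING");

    System.out.println("latest version: " + manager.loadLatestSnapshot().getVersion());
    System.out.println("changes 1..2: " + manager.getTableChanges(1, 2));
  }
}

/** Hypothetical stand-in for a table snapshot; not the Delta Kernel Snapshot type. */
final class SketchSnapshot {
  private final long version;
  private final String schema;

  SketchSnapshot(long version, String schema) {
    this.version = version;
    this.schema = schema;
  }

  long getVersion() { return version; }

  String getSchema() { return schema; }
}

/** Hypothetical, simplified interface; the real signatures in the PR may differ. */
interface SketchSnapshotManager {
  SketchSnapshot loadLatestSnapshot();

  SketchSnapshot loadSnapshotAt(long version);

  /** Versions committed in [startVersion, endVersion]; a stand-in for getTableChanges(). */
  List<Long> getTableChanges(long startVersion, long endVersion);
}

/** Path-based flavour: the only class in this sketch that holds a table path. */
final class SketchPathBasedSnapshotManager implements SketchSnapshotManager {
  private final String tablePath;
  private final TreeMap<Long, SketchSnapshot> log = new TreeMap<>();

  SketchPathBasedSnapshotManager(String tablePath) {
    this.tablePath = tablePath;
  }

  /** Appends a commit to the in-memory log and returns its version (0, 1, 2, ...). */
  long commit(String schema) {
    long version = log.isEmpty() ? 0L : log.lastKey() + 1;
    log.put(version, new SketchSnapshot(version, schema));
    return version;
  }

  @Override
  public SketchSnapshot loadLatestSnapshot() {
    if (log.isEmpty()) {
      throw new IllegalStateException("no commits found at " + tablePath);
    }
    return log.lastEntry().getValue();
  }

  @Override
  public SketchSnapshot loadSnapshotAt(long version) {
    SketchSnapshot snapshot = log.get(version);
    if (snapshot == null) {
      throw new IllegalArgumentException("unknown version " + version);
    }
    return snapshot;
  }

  @Override
  public List<Long> getTableChanges(long startVersion, long endVersion) {
    // Inclusive range on both ends.
    return new ArrayList<>(log.subMap(startVersion, true, endVersion, true).keySet());
  }
}
```

Keeping the range lookup behind `getTableChanges()` mirrors the intent stated above: streaming code asks the manager for changes instead of assembling the commit range itself.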

2. Remove Direct tablePath Dependencies

Removed tablePath references from:

  • SparkScanBuilder: Now receives SnapshotManager
  • SparkScan: Derives path from ScanStateRow.getTableRoot() when needed
  • SparkBatch: Completely removed tablePath field
  • SparkMicroBatchStream: Uses SnapshotManager.getTableChanges()

This refactoring lets future SnapshotManager implementations for CCv2 resolve snapshots without relying on file system paths.
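
The dependency change can be illustrated with a small sketch, under stated assumptions: `MiniSnapshotManager`, `MiniPathBasedManager`, `MiniCatalogManager`, and `MiniScanBuilder` are hypothetical names invented for this example, and the catalog is modelled as a plain map rather than any real catalog API. The point is only that a scan builder holding a manager, rather than a `tablePath`, works unchanged with a path-free implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Entry point is declared first so the file also runs in single-file source mode.
class PathFreeScanSketch {
  public static void main(String[] args) {
    // Path-based table: the manager is the only object that holds the path.
    MiniScanBuilder pathScan =
        new MiniScanBuilder(new MiniPathBasedManager("/tmp/hypothetical-table", 7L));

    // Hypothetical catalog-managed table: resolved by name, no path involved.
    Map<String, Long> catalog = new HashMap<>();
    catalog.put("main.sales.orders", 12L);
    MiniScanBuilder catalogScan =
        new MiniScanBuilder(new MiniCatalogManager(catalog, "main.sales.orders"));

    System.out.println(pathScan.build());
    System.out.println(catalogScan.build());
  }
}

/** Hypothetical, minimal view of what a scan builder needs from a snapshot manager. */
interface MiniSnapshotManager {
  long latestVersion();
}

/** Path-based implementation; the version is injected to keep the sketch self-contained. */
final class MiniPathBasedManager implements MiniSnapshotManager {
  private final String tablePath;
  private final long version;

  MiniPathBasedManager(String tablePath, long version) {
    this.tablePath = tablePath;
    this.version = version;
  }

  String getTablePath() { return tablePath; }

  @Override
  public long latestVersion() { return version; }
}

/** Hypothetical catalog-backed implementation: looks the version up by table name. */
final class MiniCatalogManager implements MiniSnapshotManager {
  private final Map<String, Long> catalog;
  private final String tableName;

  MiniCatalogManager(Map<String, Long> catalog, String tableName) {
    this.catalog = catalog;
    this.tableName = tableName;
  }

  @Override
  public long latestVersion() {
    Long version = catalog.get(tableName);
    if (version == null) {
      throw new IllegalStateException("table not in catalog: " + tableName);
    }
    return version;
  }
}

/** The builder depends on the interface only; it has no tablePath field. */
final class MiniScanBuilder {
  private final MiniSnapshotManager snapshotManager;

  MiniScanBuilder(MiniSnapshotManager snapshotManager) {
    this.snapshotManager = snapshotManager;
  }

  String build() {
    return "scan@v" + snapshotManager.latestVersion();
  }
}
```

Because `MiniScanBuilder` never sees a path, swapping in the catalog-backed manager requires no change to the builder, which is the property the refactoring aims for.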

How was this patch tested?

  • All existing unit tests pass
  • Updated StreamingHelperTest → PathBasedSnapshotManagerTest
  • Updated SparkScanBuilderTest and SparkMicroBatchStreamTest to use new architecture

Does this PR introduce any user-facing changes?

No
